This data set contains high-quality, multi-speaker transcribed audio data for Sinhalese. The data set consists of WAV files and a TSV file. The file si_lk.lines.txt contains, for each audio file, a FileID (which in turn contains the UserID) and the transcription of the audio in that file.
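<p>
As a rough illustration of how the index file might be read, the sketch below parses it into records. It rests on assumptions not confirmed by this description: that each line holds a FileID and a transcription separated by a tab, and that the FileID has the form prefix_UserID_utteranceID, with fields joined by underscores. The function name parse_line_index and the sample values are hypothetical; check the actual file before relying on this layout.

```python
def parse_line_index(text):
    """Parse index lines into dicts with file_id, user_id and transcription.

    Assumed layout (not confirmed by the data set description):
    FileID <TAB> Transcription, where FileID looks like prefix_UserID_utteranceID.
    """
    entries = []
    for line in text.splitlines():
        # Skip lines that do not have at least two tab-separated fields.
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        file_id = fields[0].strip()
        transcription = fields[-1].strip()
        # Assumption: the UserID is the second underscore-separated field.
        parts = file_id.split("_")
        user_id = parts[1] if len(parts) >= 3 else None
        entries.append(
            {"file_id": file_id, "user_id": user_id, "transcription": transcription}
        )
    return entries


# Made-up sample lines, for illustration only.
sample = "sin_0001_0000000001\tfirst example\nsin_0002_0000000002\tsecond example\n"
records = parse_line_index(sample)
```

Under the same assumptions, the audio for a record would be found by appending .wav to its file_id, and grouping records by user_id would give the per-speaker utterance lists.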
<p>
The data set has been manually quality checked, but there might still be errors.
<p>
This dataset was collected by Google in Sri Lanka.
<p>
See the LICENSE.txt file for license information.
<p>
Copyright 2015, 2016 Google, Inc.
